首页> 外文OA文献 >EPR-dictionaries: A practical and fast data structure for constant time searches in unidirectional and bidirectional FM-indices
【2h】

EPR-dictionaries: A practical and fast data structure for constant time searches in unidirectional and bidirectional FM-indices

机译:EpR-dictionaries:一种实用且快速的数据结构,用于恒定时间   搜索单向和双向Fm索引

代理获取
本网站仅为用户提供外文OA文献查询和代理获取服务,本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文,但由于OA文献来源多样且变更频繁,仍可能出现获取不到、文献不完整或与标题不符等情况,如果获取不到我们将提供退款服务。请知悉。

摘要

We introduce a new, practical method for conducting an exact search in a uni-and bidirectional FM index in $O(1)$ time per step while using $O(\log \sigma *n) + o(\log \sigma * \sigma * n)$ bits of space. This is done by replacing thebinary wavelet tree by a new data structure, the Enhanced Prefixsum Rankdictionary (EPR-dictionary). We implemented this method in the SeqAn C++library and experimentally validated our theoretical results. In addition wecompared our implementation with other freely available implementations ofbidirectional indices and show that we are between $\approx 2.6-4.8$ timesfaster. This will have a large impact for many bioinformatics applications thatrely on practical implementations of (2)FM indices e.g. for read mapping. Toour knowledge this is the first implementation of a constant time method for asearch step in 2FM indices.
机译:我们介绍一种新的实用方法,以便在单向和双向FM索引中以每步$ O(1)$的时间进行精确搜索,同时使用$ O(\ log \ sigma * n)+ o(\ log \ sigma * \ sigma * n)$空格位。这是通过用新的数据结构,即增强的前缀和排名(EPR-dictionary)替换二进制小波树来完成的。我们在SeqAn C ++库中实现了此方法,并通过实验验证了我们的理论结果。另外,我们将实现与双向索引的其他免费提供的实现进行了比较,表明我们的速度快了2.6到4.8美元之间。这将对许多依赖于(2)FM索引的实际实现的生物信息学应用产生重大影响。用于读取映射。众所周知,这是2FM索引中搜索步骤的恒定时间方法的首次实现。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
代理获取

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号